Overview

Dataset Statistics

Number of Variables 19
Number of Rows 1000
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 640.2 KB
Average Row Size in Memory 655.5 B
Variable Types
  • Categorical: 9
  • Numerical: 9
  • DateTime: 1
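
The two memory figures above can be cross-checked against each other. The sketch below assumes "KB" means 1024 bytes, which the report does not state; the variable names are illustrative only.

```python
# Values copied from the Dataset Statistics above.
total_kb = 640.2   # "Total Size in Memory"; assumed to be units of 1024 bytes
n_rows = 1000      # "Number of Rows"

# Average bytes per row implied by the total size.
avg_row_bytes = total_kb * 1024 / n_rows
print(f"{avg_row_bytes:.1f} B per row")
```

Under that assumption the result lands within 0.1 B of the reported 655.5 B; the small gap is consistent with the total having been rounded to one decimal place.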

Dataset Insights

  • Tax 5% and gross income have similar distributions (Similar Distribution)
  • Total and cogs have similar distributions (Similar Distribution)
  • Invoice ID has a high cardinality: 1000 distinct values (High Cardinality)
  • Time has a high cardinality: 506 distinct values (High Cardinality)
  • Invoice ID has constant length 11 (Constant Length)
  • Branch has constant length 1 (Constant Length)
  • Customer type has constant length 6 (Constant Length)
  • Time has constant length 5 (Constant Length)
  • Month has constant length 1 (Constant Length)
  • Invoice ID has all distinct values (Unique)
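
The "similar distribution" insights suggest that some columns are derived from others. A minimal sketch, using only the means reported further down and assuming (from the column name) a 5% tax rate applied to cogs, checks whether Tax 5% is 5% of cogs and whether Total is cogs plus tax. The relationships are an inference from the summary numbers, not something the report states.

```python
import math

# Means copied from the per-variable sections of this report.
cogs_mean = 307.5874
tax_mean = 15.3795
gross_income_mean = 15.3795
total_mean = 322.9668

tax_rate = 0.05  # assumption: taken from the column name "Tax 5%"

# Allow for the four-decimal rounding used in the report.
tax_is_rate_of_cogs = math.isclose(cogs_mean * tax_rate, tax_mean, abs_tol=1e-3)
total_is_cogs_plus_tax = math.isclose(cogs_mean + tax_mean, total_mean, abs_tol=1e-3)
gross_income_equals_tax = gross_income_mean == tax_mean

print(tax_is_rate_of_cogs, total_is_cogs_plus_tax, gross_income_equals_tax)
```

If all three checks hold on the raw rows as well, Tax 5%, Total and gross income carry no information beyond cogs, which matters when choosing features for a model.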

Variables


Invoice ID

categorical

Approximate Distinct Count 1000
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Memory Size 76000

Length

Mean 11
Standard Deviation 0
Median 11
Minimum 11
Maximum 11

Sample

1st row 750-67-8428
2nd row 226-31-3081
3rd row 631-41-3108
4th row 123-19-1176
5th row 373-73-7910

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 2000
Decimal Number 9000
  • Invoice ID has words of constant length
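
The character-class rows above (Decimal Number, Dash Punctuation, and so on) correspond to Unicode general categories. The sketch below tallies those categories for the five sample rows; the helper name `char_classes` is hypothetical, and this is not necessarily how the profiling tool computes its counts.

```python
import unicodedata
from collections import Counter

# The five sample values listed for Invoice ID.
samples = ["750-67-8428", "226-31-3081", "631-41-3108", "123-19-1176", "373-73-7910"]

def char_classes(values):
    """Count characters per Unicode general category (e.g. 'Nd', 'Pd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

counts = char_classes(samples)
# 'Nd' = Decimal Number, 'Pd' = Dash Punctuation
print(counts["Nd"], counts["Pd"])
```

Each sample contributes 9 digits and 2 dashes, which scales to the 9000 and 2000 reported for all 1000 rows.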

Branch

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 66000

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row A
2nd row C
3rd row A
4th row A
5th row A

Letter

Count 1000
Lowercase Letter 0
Space Separator 0
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (A, B) take over 50.0%
  • Branch has words of constant length

City

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 72648

Length

Mean 7.648
Standard Deviation 1.2513
Median 8
Minimum 6
Maximum 9

Sample

1st row Yangon
2nd row Naypyitaw
3rd row Yangon
4th row Yangon
5th row Yangon

Letter

Count 7648
Lowercase Letter 6648
Space Separator 0
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Yangon, Mandalay) take over 50.0%

Customer type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 71000

Length

Mean 6
Standard Deviation 0
Median 6
Minimum 6
Maximum 6

Sample

1st row Member
2nd row Normal
3rd row Normal
4th row Member
5th row Normal

Letter

Count 6000
Lowercase Letter 5000
Space Separator 0
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Member, Normal) take over 50.0%
  • Customer type has words of constant length

Gender

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 70002

Length

Mean 5.002
Standard Deviation 1.0005
Median 6
Minimum 4
Maximum 6

Sample

1st row Female
2nd row Female
3rd row Male
4th row Male
5th row Male

Letter

Count 5002
Lowercase Letter 4002
Space Separator 0
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Female, Male) take over 50.0%
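
Because Gender has only two values of different lengths, the length statistics pin down the class counts. The sketch below assumes the two categories are exactly "Female" and "Male" (as the samples suggest) and that the reported standard deviation is the sample (n − 1) version; both are assumptions.

```python
import statistics

n_rows = 1000
mean_length = 5.002            # "Mean" under Length
len_female, len_male = len("Female"), len("Male")

# Solve: len_female * f + len_male * m = total characters, with f + m = n_rows.
total_chars = round(mean_length * n_rows)
female = (total_chars - len_male * n_rows) // (len_female - len_male)
male = n_rows - female

# Rebuild the length column and recompute its sample standard deviation.
lengths = [len_female] * female + [len_male] * male
sample_std = statistics.stdev(lengths)

print(female, male, round(sample_std, 4))
```

The implied split is nearly even, and the recomputed standard deviation agrees with the reported 1.0005.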

Product line

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.6%
Missing 0
Missing (%) 0.0%
Memory Size 83540

Length

Mean 18.54
Standard Deviation 1.7109
Median 18
Minimum 17
Maximum 22

Sample

1st row Health and beauty
2nd row Electronic accesso...
3rd row Home and lifestyle
4th row Health and beauty
5th row Sports and travel

Letter

Count 16888
Lowercase Letter 15888
Space Separator 1652
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The largest value (accessories) is over 1.96 times larger than the second largest value (fashion)

Unit price

numerical

Approximate Distinct Count 943
Approximate Unique (%) 94.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 55.6721
Minimum 10.08
Maximum 99.96
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Unit price is only marginally skewed right (γ1 = 0.0071); the distribution is close to symmetric

Quantile Statistics

Minimum 10.08
5-th Percentile 15.279
Q1 32.875
Median 55.23
Q3 77.935
95-th Percentile 97.222
Maximum 99.96
Range 89.88
IQR 45.06

Descriptive Statistics

Mean 55.6721
Standard Deviation 26.4946
Variance 701.9653
Sum 55672.13
Skewness 0.007067
Kurtosis -1.2185
Coefficient of Variation 0.4759
  • Unit price is not normally distributed (p-value 0.0016064029088901917)
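
Several of the rows above are derived from others: Range and IQR come from the quantiles, Variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short sketch re-deriving them for Unit price from the reported values:

```python
# Values copied from the Unit price tables above.
minimum, maximum = 10.08, 99.96
q1, q3 = 32.875, 77.935
mean, std = 55.6721, 26.4946

value_range = maximum - minimum   # reported: 89.88
iqr = q3 - q1                     # reported: 45.06
variance = std ** 2               # reported: 701.9653
coeff_variation = std / mean      # reported: 0.4759

print(round(value_range, 2), round(iqr, 2), round(variance, 2), round(coeff_variation, 4))
```

Small differences in the last digits of the variance are expected, since the standard deviation is itself rounded to four decimals.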

Quantity

numerical

Approximate Distinct Count 10
Approximate Unique (%) 1.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 5.51
Minimum 1
Maximum 10
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Quantity is only marginally skewed right (γ1 = 0.0129); the distribution is close to symmetric

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 3
Median 5
Q3 8
95-th Percentile 10
Maximum 10
Range 9
IQR 5

Descriptive Statistics

Mean 5.51
Standard Deviation 2.9234
Variance 8.5464
Sum 5510
Skewness 0.01292
Kurtosis -1.2155
Coefficient of Variation 0.5306
  • Quantity is not normally distributed (p-value 0.00024017961746093384)
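
A kurtosis near −1.2 together with near-zero skewness is what a flat (uniform) distribution produces; a continuous uniform has excess kurtosis of exactly −1.2. As an illustration, the sketch below computes the excess kurtosis of an idealized Quantity column in which each value from 1 to 10 is equally likely. This is a hypothetical reference distribution, not the actual data.

```python
# Idealized Quantity: every integer from 1 to 10 equally likely (an assumption).
values = range(1, 11)
n = len(values)

mean = sum(values) / n
m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment

excess_kurtosis = m4 / m2 ** 2 - 3
print(round(mean, 2), round(excess_kurtosis, 4))
```

The idealized mean and kurtosis sit close to the reported 5.51 and −1.2155, which suggests Quantity is roughly uniform over its ten values; that would also explain why the normality test rejects despite the symmetry.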

Tax 5%

numerical

Approximate Distinct Count 879
Approximate Unique (%) 87.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 15.3795
Minimum 0.51
Maximum 49.65
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Tax 5% is skewed right (γ1 = 0.8912)

Quantile Statistics

Minimum 0.51
5-th Percentile 1.9595
Q1 5.9275
Median 12.09
Q3 22.445
95-th Percentile 39.171
Maximum 49.65
Range 49.14
IQR 16.5175

Descriptive Statistics

Mean 15.3795
Standard Deviation 11.7088
Variance 137.097
Sum 15379.51
Skewness 0.8912
Kurtosis -0.08751
Coefficient of Variation 0.7613
  • Tax 5% is not normally distributed (p-value 0.0030122073741519545)
  • Tax 5% has 9 outliers
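
The report does not say which rule it uses to flag outliers. Assuming the common box-plot convention (points beyond 1.5 × IQR from the quartiles), the fences for Tax 5% can be computed from the quantile table:

```python
# Values copied from the Tax 5% quantile table above.
minimum, maximum = 0.51, 49.65
q1, q3 = 5.9275, 22.445
iqr = q3 - q1

# Assumption: the 1.5 * IQR box-plot rule; the report does not state its rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

has_low_outliers = minimum < lower_fence
has_high_outliers = maximum > upper_fence
print(round(lower_fence, 5), round(upper_fence, 5), has_low_outliers, has_high_outliers)
```

Under this rule the lower fence is negative, so any flagged points can only lie in the upper tail, which fits the right skew reported for this column.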

Total

numerical

Approximate Distinct Count 990
Approximate Unique (%) 99.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 322.9668
Minimum 10.68
Maximum 1042.65
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Total is skewed right (γ1 = 0.8912)

Quantile Statistics

Minimum 10.68
5-th Percentile 41.074
Q1 124.425
Median 253.85
Q3 471.35
95-th Percentile 822.501
Maximum 1042.65
Range 1031.97
IQR 346.925

Descriptive Statistics

Mean 322.9668
Standard Deviation 245.8854
Variance 60459.6287
Sum 322966.82
Skewness 0.8912
Kurtosis -0.08747
Coefficient of Variation 0.7613
  • Total is not normally distributed (p-value 0.004207220919607963)
  • Total has 9 outliers

Date

datetime

Approximate Distinct Count 88 (estimated value: 88.0591)
Approximate Unique (%) 8.8%
Missing 0
Missing (%) 0.0%
Memory Size 8128
Minimum 2019-01-01 00:00:00
Maximum 2019-03-30 00:00:00
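
The distinct count for Date is an estimate, so it helps to know the upper bound set by the date range. A small sketch counting the calendar days between the reported minimum and maximum, inclusive:

```python
from datetime import date

# Minimum and maximum copied from the Date section above.
first = date(2019, 1, 1)
last = date(2019, 3, 30)

# Number of calendar days in the closed interval [first, last].
calendar_days = (last - first).days + 1
print(calendar_days)
```

The range spans 89 days, so an estimate of about 88 distinct dates means nearly every day in the period has at least one invoice.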

Time

categorical

Approximate Distinct Count 506
Approximate Unique (%) 50.6%
Missing 0
Missing (%) 0.0%
Memory Size 70000

Length

Mean 5
Standard Deviation 0
Median 5
Minimum 5
Maximum 5

Sample

1st row 13:08
2nd row 10:29
3rd row 13:23
4th row 20:33
5th row 10:37

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 4000
  • Time has words of constant length

Payment

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 72212

Length

Mean 7.212
Standard Deviation 2.8346
Median 7
Minimum 4
Maximum 11

Sample

1st row Ewallet
2nd row Cash
3rd row Credit card
4th row Ewallet
5th row Ewallet

Letter

Count 6901
Lowercase Letter 5901
Space Separator 311
Uppercase Letter 1000
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Ewallet, Cash) take over 50.0%

cogs

numerical

Approximate Distinct Count 990
Approximate Unique (%) 99.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 307.5874
Minimum 10.17
Maximum 993
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • cogs is skewed right (γ1 = 0.8912)

Quantile Statistics

Minimum 10.17
5-th Percentile 39.1145
Q1 118.4975
Median 241.76
Q3 448.905
95-th Percentile 783.33
Maximum 993
Range 982.83
IQR 330.4075

Descriptive Statistics

Mean 307.5874
Standard Deviation 234.1765
Variance 54838.6377
Sum 307587.38
Skewness 0.8912
Kurtosis -0.08747
Coefficient of Variation 0.7613
  • cogs is not normally distributed (p-value 0.004207220919607963)
  • cogs has 9 outliers

gross income

numerical

Approximate Distinct Count 879
Approximate Unique (%) 87.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 15.3795
Minimum 0.51
Maximum 49.65
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • gross income is skewed right (γ1 = 0.8912)

Quantile Statistics

Minimum 0.51
5-th Percentile 1.9595
Q1 5.9275
Median 12.09
Q3 22.445
95-th Percentile 39.171
Maximum 49.65
Range 49.14
IQR 16.5175

Descriptive Statistics

Mean 15.3795
Standard Deviation 11.7088
Variance 137.097
Sum 15379.51
Skewness 0.8912
Kurtosis -0.08751
Coefficient of Variation 0.7613
  • gross income is not normally distributed (p-value 0.0030122073741519545)
  • gross income has 9 outliers

Rating

numerical

Approximate Distinct Count 61
Approximate Unique (%) 6.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 6.9727
Minimum 4
Maximum 10
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Rating is only marginally skewed right (γ1 = 0.009); the distribution is close to symmetric

Quantile Statistics

Minimum 4
5-th Percentile 4.295
Q1 5.5
Median 7
Q3 8.5
95-th Percentile 9.7
Maximum 10
Range 6
IQR 3

Descriptive Statistics

Mean 6.9727
Standard Deviation 1.7186
Variance 2.9535
Sum 6972.7
Skewness 0.008996
Kurtosis -1.1518
Coefficient of Variation 0.2465
  • Rating is not normally distributed (p-value 0.0005242112669547172)

Month

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Memory Size 66000

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 1
2nd row 3
3rd row 3
4th row 1
5th row 2

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1000
  • The top 2 categories (1, 3) take over 50.0%
  • Month has words of constant length

Hour

numerical

Approximate Distinct Count 11
Approximate Unique (%) 1.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 14.91
Minimum 10
Maximum 20
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Hour is only marginally skewed right (γ1 = 0.0257); the distribution is close to symmetric

Quantile Statistics

Minimum 10
5-th Percentile 10
Q1 12
Median 15
Q3 18
95-th Percentile 20
Maximum 20
Range 10
IQR 6

Descriptive Statistics

Mean 14.91
Standard Deviation 3.1869
Variance 10.1561
Sum 14910
Skewness 0.02575
Kurtosis -1.2562
Coefficient of Variation 0.2137
  • Hour is not normally distributed (p-value 0.0007584741625140728)

Hour_12

numerical

Approximate Distinct Count 11
Approximate Unique (%) 1.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 16000
Mean 6.27
Minimum 1
Maximum 12
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Hour_12 is skewed right (γ1 = 0.1291)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 3
Median 6
Q3 10
95-th Percentile 12
Maximum 12
Range 11
IQR 7

Descriptive Statistics

Mean 6.27
Standard Deviation 3.5533
Variance 12.6257
Sum 6270
Skewness 0.1291
Kurtosis -1.2263
Coefficient of Variation 0.5667
  • Hour_12 is not normally distributed (p-value 0.0007584741625140728)
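
Hour_12 looks like a 12-hour-clock version of Hour. The report does not describe how it was built; the sketch below assumes the usual conversion and checks that it reproduces the reported minimum, maximum and distinct count. The function name `to_12_hour` is hypothetical.

```python
def to_12_hour(hour_24):
    """Map a 24-hour clock hour (0-23) to a 12-hour clock hour (1-12)."""
    return (hour_24 - 1) % 12 + 1

# Hour ranges from 10 to 20 with 11 distinct values, per the Hour section.
hours_24 = range(10, 21)
hours_12 = [to_12_hour(h) for h in hours_24]

print(min(hours_12), max(hours_12), len(set(hours_12)))
```

Note that the conversion discards the AM/PM distinction and breaks the natural ordering (12 precedes 1 within the same afternoon), so Hour is usually the safer column for time-of-day analysis.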
